122 research outputs found

Learning to Reconstruct Shapes from Unseen Classes

Author: Freeman William T.
Tenenbaum Joshua B.
Wu Jiajun
Zhang Chengkai
Zhang Xiuming
Zhang Zhoutong
Publication date: 28/12/2018

From a single image, humans are able to perceive the full 3D shape of an object by exploiting learned shape priors from everyday life. Contemporary single-image 3D reconstruction algorithms aim to solve this task in a similar fashion, but often end up with priors that are highly biased by training classes. Here we present an algorithm, Generalizable Reconstruction (GenRe), designed to capture more generic, class-agnostic shape priors. We achieve this with an inference network and training procedure that combine 2.5D representations of visible surfaces (depth and silhouette), spherical shape representations of both visible and non-visible surfaces, and 3D voxel-based representations, in a principled manner that exploits the causal structure of how 3D shapes give rise to 2D images. Experiments demonstrate that GenRe performs well on single-view shape reconstruction, and generalizes to diverse novel objects from categories not seen during training.

Comment: NeurIPS 2018 (Oral). The first two authors contributed equally to this paper. Project page: http://genre.csail.mit.edu

arXiv.org e-Print Archive

DSpace@MIT

Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling

Author: Freeman William T.
Sun Xingyuan
Tenenbaum Joshua B.
Wu Jiajun
Xue Tianfan
Zhang Chengkai
Zhang Xiuming
Zhang Zhoutong
Publication date: 12/04/2018

We study 3D shape modeling from a single image and make contributions to it in three aspects. First, we present Pix3D, a large-scale benchmark of diverse image-shape pairs with pixel-level 2D-3D alignment. Pix3D has wide applications in shape-related tasks including reconstruction, retrieval, viewpoint estimation, etc. Building such a large-scale dataset, however, is highly challenging; existing datasets either contain only synthetic data, or lack precise alignment between 2D images and 3D shapes, or only have a small number of images. Second, we calibrate the evaluation criteria for 3D shape reconstruction through behavioral studies, and use them to objectively and systematically benchmark cutting-edge reconstruction algorithms on Pix3D. Third, we design a novel model that simultaneously performs 3D reconstruction and pose estimation; our multi-task learning approach achieves state-of-the-art performance on both tasks.

Comment: CVPR 2018. The first two authors contributed equally to this work. Project page: http://pix3d.csail.mit.ed

arXiv.org e-Print Archive

Crossref

DSpace@MIT

Visual Object Networks: Image Generation with Disentangled 3D Representation

Author: Freeman William T.
Tenenbaum Joshua B.
Torralba Antonio
Wu Jiajun
Zhang Chengkai
Zhang Zhoutong
Zhu Jun-Yan
Publication date: 06/12/2018

Recent progress in deep generative models has led to tremendous breakthroughs in image generation. However, while existing models can synthesize photorealistic images, they lack an understanding of our underlying 3D world. We present a new generative model, Visual Object Networks (VON), synthesizing natural images of objects with a disentangled 3D representation. Inspired by classic graphics rendering pipelines, we unravel our image formation process into three conditionally independent factors (shape, viewpoint, and texture) and present an end-to-end adversarial learning framework that jointly models 3D shapes and 2D images. Our model first learns to synthesize 3D shapes that are indistinguishable from real shapes. It then renders the object's 2.5D sketches (i.e., silhouette and depth map) from its shape under a sampled viewpoint. Finally, it learns to add realistic texture to these 2.5D sketches to generate natural images. The VON not only generates images that are more realistic than state-of-the-art 2D image synthesis methods, but also enables many 3D operations such as changing the viewpoint of a generated image, editing of shape and texture, linear interpolation in texture and shape space, and transferring appearance across different objects and viewpoints.

Comment: NeurIPS 2018. Code: https://github.com/junyanz/VON Website: http://von.csail.mit.edu

arXiv.org e-Print Archive

DSpace@MIT

Height Information Aided 3D Real-Time Large-Scale Underground User Positioning

Author: Song Houbing
Tang Chengkai
Zhang Cunle
Zhang Lingling
Zhang Yi
Publication venue: Scholarly Commons
Publication date: 09/09/2022

Because inertial and visual navigation equipment is costly and satellite navigation signals are lacking, these techniques cannot be used in large-scale underground mining environments. To solve this problem, this study proposes a large-scale underground 3D real-time positioning method with seam height assistance. The method uses ultra-wideband positioning base stations as the core and combines them with seam height information to build a factor graph confidence transfer model to realise 3D positioning. The simulation results show that the proposed real-time method is superior to the existing algorithms in positioning accuracy and can meet the needs of large-scale underground users.

Directory of Open Access Journals

Embry-Riddle Aeronautical University
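
The role of height information in the positioning abstract above can be illustrated with a much simpler stand-in than the paper's factor graph model. The sketch below assumes the user's height `z` is already known (for example from seam height), removes the vertical component from each measured range, and solves the remaining 2D problem by linearized least squares. The function name `locate_xy`, the anchor layout, and the sample coordinates are all hypothetical and are not taken from the paper.

```python
import math

def locate_xy(anchors, ranges, z):
    """Estimate (x, y) from ranges to 3D anchors when the height z is known.

    anchors: list of (x, y, z) base-station coordinates (hypothetical values).
    ranges:  measured distances to each anchor, in the same order.
    """
    # Remove the known vertical offset to get squared horizontal distances.
    d2 = [r * r - (z - az) ** 2 for (_, _, az), r in zip(anchors, ranges)]
    x0, y0, _ = anchors[0]
    rows, rhs = [], []
    # Subtract the first anchor's circle equation from the others to linearize.
    for (xi, yi, _), di2 in zip(anchors[1:], d2[1:]):
        rows.append((2 * (xi - x0), 2 * (yi - y0)))
        rhs.append((xi * xi - x0 * x0) + (yi * yi - y0 * y0) - (di2 - d2[0]))
    # Solve the 2x2 normal equations (A^T A) p = A^T b with Cramer's rule.
    s11 = sum(a * a for a, _ in rows)
    s12 = sum(a * b for a, b in rows)
    s22 = sum(b * b for _, b in rows)
    t1 = sum(a * c for (a, _), c in zip(rows, rhs))
    t2 = sum(b * c for (_, b), c in zip(rows, rhs))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-12:
        raise ValueError("anchor geometry is degenerate (e.g. collinear anchors)")
    return ((s22 * t1 - s12 * t2) / det, (s11 * t2 - s12 * t1) / det)

# Noise-free example with three hypothetical anchors mounted at height 2.
anchors = [(0.0, 0.0, 2.0), (10.0, 0.0, 2.0), (0.0, 10.0, 2.0)]
true_position = (3.0, 4.0, 1.0)
ranges = [math.dist(true_position, a) for a in anchors]
estimate = locate_xy(anchors, ranges, z=true_position[2])
```

With the height fixed, three non-collinear anchors are enough to determine the horizontal position in this noise-free setting; real measurements would need a noise model, which is where a probabilistic formulation such as the one named in the abstract comes in.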

Time Reversal Aided Bidirectional OFDM Underwater Cooperative Communication Algorithm with the Same Frequency Transmission

Author: Chengkai Tang
Houbing Song
Jianguo Huang
Lingling Zhang
Publication venue: Hindawi Limited
Publication date: 01/01/2017

Crossref

Time Reversal Aided Bidirectional OFDM Underwater Cooperative Communication Algorithm with the Same Frequency Transmission

Author: Huang Jianguo
Song Houbing
Tang Chengkai
Zhang Lingling
Publication venue: The Research Repository @ WVU
Publication date: 01/01/2017

In an underwater acoustic channel, signal transmission may experience significant latency and attenuation that degrade the performance of underwater communication. Cooperative communication can address this, but its spectrum efficiency is lower than that of traditional underwater communication. We therefore propose a time reversal aided bidirectional OFDM underwater cooperative communication algorithm. The algorithm allows all underwater sensor nodes to share the same uplink and downlink frequency simultaneously to improve spectrum efficiency. Since same-frequency transmission produces larger intersymbol interference, we first adopt the time reversal method to reduce the multipath interference; we then use a self-information cancellation module to remove the self-signal of the OFDM block, because it is known to the sensor nodes. In the simulation part, we compare the proposed algorithm with existing underwater cooperative transmission algorithms in terms of bit error ratio, transmission rate, and computation. The results show that the proposed algorithm doubles the spectrum efficiency at the same bit error ratio and achieves a higher transmission rate than the other underwater communication methods.

Directory of Open Access Journals

The Research Repository @ WVU (West Virginia University)
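
The time reversal step mentioned in the abstract above rests on a standard signal-processing property: sending a signal through a multipath channel and then through the time-reversed channel response yields the channel's autocorrelation, which concentrates the multipath energy in a single central tap. The sketch below is only an illustration of that property, assuming a made-up real-valued three-tap impulse response (a complex channel would also require conjugation). It is not the authors' algorithm, and the names `convolve` and `time_reversal_response` are hypothetical.

```python
def convolve(a, b):
    """Full discrete convolution of two real-valued sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def time_reversal_response(channel):
    """Effective response of the channel followed by its time-reversed copy."""
    return convolve(channel, channel[::-1])

# Hypothetical multipath channel: a direct path plus two weaker echoes.
channel = [1.0, 0.5, 0.25]
effective = time_reversal_response(channel)

# The strongest tap of the effective response sits at the centre.
peak_index = max(range(len(effective)), key=lambda i: abs(effective[i]))
```

The effective response is symmetric with its largest tap in the middle, so the echoes that spread the original channel over several taps are partly refocused; the residual side taps are what a later equalization or cancellation stage would still have to handle.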

Subclass-balancing Contrastive Learning for Long-tailed Recognition

Author: Hou Chengkai
Wang Haonan
Zhang Jieyu
Zhou Tianyi
Publication date: 28/06/2023

Long-tailed recognition with imbalanced class distribution naturally emerges in practical machine learning applications. Existing methods such as data reweighting, resampling, and supervised contrastive learning enforce class balance at the price of introducing imbalance between instances of head classes and tail classes, which may ignore the underlying rich semantic substructures of the former and exaggerate the biases in the latter. We overcome these drawbacks with a novel "subclass-balancing contrastive learning (SBCL)" approach that clusters each head class into multiple subclasses of similar sizes as the tail classes and enforces representations to capture the two-layer class hierarchy between the original classes and their subclasses. Since the clustering is conducted in the representation space and updated during the course of training, the subclass labels preserve the semantic substructures of head classes. Meanwhile, it does not overemphasize tail class samples, so each individual instance contributes to the representation learning equally. Hence, our method achieves both instance- and subclass-balance, while the original class labels are also learned through contrastive learning among subclasses from different classes. We evaluate SBCL over a list of long-tailed benchmark datasets and it achieves state-of-the-art performance. In addition, we present extensive analyses and ablation studies of SBCL to verify its advantages.

arXiv.org e-Print Archive
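
The subclass construction described in the SBCL abstract above can be sketched roughly as follows: each class is split into a number of clusters chosen so that a cluster is about the size of the smallest (tail) class. This is a minimal illustration under several assumptions that do not come from the paper: it uses plain k-means with a deterministic initialization (which does not guarantee equal-sized clusters), fixed toy features instead of representations updated during training, and hypothetical names such as `assign_subclasses`.

```python
from collections import Counter, defaultdict

def dist2(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means; initial centres are evenly spaced input points."""
    step = len(points) / k
    centers = [points[int(i * step)] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda j: dist2(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old centre if a cluster becomes empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return assign

def assign_subclasses(features, labels):
    """Return a (class, cluster_id) subclass label for every sample."""
    sizes = Counter(labels)
    tail_size = min(sizes.values())  # size of the smallest class
    by_class = defaultdict(list)
    for idx, lab in enumerate(labels):
        by_class[lab].append(idx)
    sub = [None] * len(labels)
    for lab, idxs in by_class.items():
        k = max(1, len(idxs) // tail_size)  # head classes get several clusters
        clusters = kmeans([features[i] for i in idxs], k)
        for i, c in zip(idxs, clusters):
            sub[i] = (lab, c)
    return sub

# Toy data: a 6-sample head class with three visible groups, a 2-sample tail class.
features = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0), (10.1, 0.0),
            (20.0, 20.0), (20.1, 20.0)]
labels = ["head"] * 6 + ["tail"] * 2
subclasses = assign_subclasses(features, labels)
```

In this toy case the head class is split into three subclasses of two samples each, matching the tail class size, while the tail class stays a single subclass. The resulting labels would then serve as the positive/negative grouping for a contrastive loss, which is omitted here.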